Fred Goldstein k1io
goldstein@aim.dec.com
Version 2.0 3-june-1988
The Radio Shortest Path First Routing Algorithm (RSPF)
For DDN Internet Protocol over Amateur Packet Radio
** DRAFT ARCHITECTURE -- FOR COMMENT **
CONTENTS
I. Introduction
II. Acquisition of router-router adjacencies
III. Acquisition of end node adjacencies
IV. Link state propagation
V. The Shortest Path First Spanning Tree
Appendix: Router Parameters
I. Introduction
Amateur packet radio use of the Internet Protocol does not yet provide
all of the capabilities of other IP networks. One particular example
of this is IP packet routing. Existing versions of the AMPR IP code
make use of a static routing table. This requires human intervention
every time a new backbone path is added, and adds geographic constraints
to address assignment which do not exist on the ARPA Internet.
The core ARPAnet has implemented automatic routing based upon Dijkstra's
"SPF" (shortest path first) spanning tree algorithm. A similar
procedure can be applied to AMPRnet (Net 44). It is called Radio
Shortest Path First (RSPF); this document outlines the RSPF protocol.
I.1. Elements of RSPF
The RSPF protocol is designed for use by network-layer routing nodes (IP
Gateways) in a packet radio network using the DDN Internet Protocol.
It comprises four major functions:
1) Acquisition of router-router adjacencies
2) Acquisition of end node adjacencies
3) Link state propagation
4) Spanning tree route decision making.
Its net result is the automatic maintenance of a least-cost routing
table for use by IP Routing. RSPF is optimized for the geographically
hierarchical addressing used in AMPRnet, but does not depend upon it.
I.2. Addressing convention
When an RSPF router communicates with an end node, it will typically
deal with a 32 bit IP address. Routers themselves, however, also
support node group addressing (fewer than 32 bits need match). This
follows the convention in the KA9Q IP routing program, which permits a
crude form of hierarchical addressing as well as allowing portable
operations to override the defaults. RSPF looks for the match (node or
node group) with the greatest number of matching bits. Only when two
matches have an equal number of matching bits is the lower-cost path used.
(Thus a match to a full node address will override a "cheaper" path that
matches its "class C net" of 24 bits, which overrides a "class B net",
etc., noting that AMPRnet does not follow strict 8-bit address
classification, and is a single Class A net.)
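The matching rule above can be illustrated with a minimal sketch. It is
not part of the RSPF draft: the Route record, the next_hop labels and the
select_route() function are hypothetical names chosen for this example,
and the costs are made up. Candidate routes are ranked first by number of
significant bits (most bits wins) and only then by cost.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    network: ipaddress.IPv4Network  # /32 for a node, shorter for a node group
    cost: int                       # lower is better
    next_hop: str                   # hypothetical label for the outgoing adjacency


def select_route(dest, routes):
    """Pick the match with the most significant bits; break ties by lowest cost."""
    addr = ipaddress.IPv4Address(dest)
    candidates = [r for r in routes if addr in r.network]
    if not candidates:
        return None
    # Negated prefix length sorts longer matches first; cost decides only ties.
    return min(candidates, key=lambda r: (-r.network.prefixlen, r.cost))


routes = [
    Route(ipaddress.IPv4Network("44.56.0.0/16"), cost=2, next_hop="A"),
    Route(ipaddress.IPv4Network("44.56.0.0/24"), cost=9, next_hop="B"),
    Route(ipaddress.IPv4Network("44.56.0.44/32"), cost=20, next_hop="C"),
    Route(ipaddress.IPv4Network("44.56.0.0/24"), cost=5, next_hop="D"),
]
```

With these sample routes, the full node address 44.56.0.44 selects the
32-bit entry despite its higher cost, while another address in the same
24-bit group selects the cheaper of the two 24-bit entries.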
I.3. Connection-oriented vs. connectionless lower layers
IP is a datagram network protocol, and supports both connection-oriented
and connectionless lower layers. In a network with a high packet loss
rate, the local retransmission provided by a connection-oriented
datalink will substantially improve overall throughput. However, in a
high-speed dedicated backbone, particularly one implemented using
full-duplex radio or wireline links, connectionless may provide better
overall performance. The choice of which to use is a local matter; RSPF
will work with both. The reliability of the routing information,
however, may be somewhat greater with connection-oriented datalink
procedures, since these will give more rapid indication of a physical
link failure.
I.4. Relationship to other protocols
The reliability of the network depends upon reasonably reliable
transmission of the routing update; hence, for non-broadcast procedures,
it is recommended that routers communicate with one another using
connected-mode AX.25, or another reliable data link layer. (In any case
connected-AX.25 may be more useful than connectionless for backbone
links due to its local error correction ability.)
All packets specific to RSPF shall be exchanged inside IP packets using
a protocol identifier of <tbd>. Such IP packets shall be sent with a
time to live (TTL) value of 1. If broadcast procedures are used,
connectionless AX.25 UI frames shall be sent, with a multicast address
"QSTRTR-0" in the AX.25 address and an IP address of 44.255.255.255. (A
router can also "read the mail" on passing traffic not addressed to it;
such procedures are for further study.)
Note that in this document, "subnetwork" and "data link" are synonymous,
and refer to the layer over which IP packets are exchanged.
II. Acquisition of router-router adjacencies
For RSPF to operate correctly, all routers must remain reasonably
current as to the state of the network at large. This is handled by
the propagation of "bulletin" messages among routers. End nodes need
not concern themselves with this; they will normally communicate
through one "designated" router at any given time, for all
(non-adjacent) destinations (not seen by ARP or other lower-layer
procedures).
All information propagated through the bulletin process begins with each
router's maintenance of an adjacency table. Each router's adjacency
table lists the following information for all other nodes, both routers
and end nodes, from which it directly receives packets over a subnetwork
(or data link):
Adjacent node IP address (e.g., 44.56.0.44)
Adjacent node datalink (AX.25) address (e.g., K1IO-0)
Datalink used (e.g., AX0)
Datalink cost (e.g., 4)
Number of packets heard since last RRH update (e.g., 50)
Packet sequence number in last RRH update (e.g., 12593)
Time of last RRH update (e.g., 2130).
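As an illustration only, one adjacency table entry might be represented
as follows. The field names and the note_packet_heard() helper are
hypothetical; the draft specifies the information to be kept, not a data
layout. The sample values are the ones given in the list above.

```python
from dataclasses import dataclass


@dataclass
class Adjacency:
    ip_address: str        # adjacent node IP address
    datalink_address: str  # adjacent node AX.25 address
    datalink: str          # interface the node was heard on
    cost: int              # datalink cost
    packets_heard: int = 0  # packets heard since the last RRH update
    last_rrh_seq: int = 0   # packet sequence number in the last RRH update
    last_rrh_time: int = 0  # time of the last RRH update


def note_packet_heard(table, ip_address):
    """Count one more packet received from a known adjacency."""
    table[ip_address].packets_heard += 1


# Table keyed by IP address (an assumption made for easy lookup).
table = {
    "44.56.0.44": Adjacency("44.56.0.44", "K1IO-0", "AX0", 4, 50, 12593, 2130),
}
note_packet_heard(table, "44.56.0.44")
```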
II.1. Router-router hello
For the backbone to create its topology automatically, there must be a
way for routers to discover each other with minimal overhead. For
this purpose, a router-router hello (RRH) message is provided.
Periodically (as an initial suggestion, shortly before beginning to
propagate the periodic link state bulletin to known adjacencies), the
router sends out the RRH message to the layer 2 multicast address and IP
multicast address (e.g., 44.255.255.255). RSPF makes no assumption of
reciprocity (that links are bidirectional), so receipt of an RRH packet
provides the receiver with information about a one-way (received)
adjacency.
II.2. Connection-oriented procedure
If a router uses connection-oriented datalink procedures to its own
adjacencies (recommended), then when a router receives this RRH
packet, it checks to see if it already has a link to its origin in its
own links table. If not, it waits a random period of time (initial
suggestion: within the range of 0 -> 10*(link's value of T1, DWAIT or
SLOTTIME - tbd)) and then attempts to establish an AX.25 connection with
the usual SABM; the originating router responds with the usual UA (link established)
or DM (link rejected).
If a two-way connection has been established, then both routers add the
link to their adjacency tables. While the existence of this route is
set reciprocally, the cost of each side of the route is set separately
for each side of the connection. Cost is transmitted in the link state
packet. Thus the adjacency between two routers is not actually used
until the first link state packet exchange has taken place.
Loss of an adjacency may be noted by the loss of the AX.25 connection.
When this occurs, the router removes the lost neighbor from its adjacency table
and follows the "bad news" procedures outlined below for link state
propagation.
II.3. Connectionless procedure
If a router uses connectionless datalink procedures to its own
adjacencies (suitable to low-loss links), then when a router receives an
RRH packet, it checks to see if this adjacency is already in its
adjacency table. If not, then it is added.
Loss of an adjacency may be noted by timeout; if no RRH message is
received, and no frames have been received from the adjacent router for
a period of time (initial suggestion: slightly over twice the maximum
interval between RRH messages), then the adjacency becomes suspect.
The router should attempt to exchange a packet with the suspect
adjacency; if unsuccessful, the route is marked lost. It may also be
marked lost if other attempts to send data through that router fail.
(Exact procedures for further study.)
Table II-1. Coding of the RRH PDU.
                                 1               2
|0              |8              |6              |4              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| RSPF Version #|  Type (RRH)   |           Checksum            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|               Full IP Address of sending router               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| last packet sent seq. # | flags |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| plaintext |...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Parameters--
An RSPF Router-Router Hello packet is sent within IP with a type
of <tbd>. Each RRH packet contains the following fields:
RSPF Version Number: Version number of this protocol (initially 1).
Type: Value of 3 for RRH.
Checksum: IP-style checksum
Address: Full IP address of sending router
Last packet sent sequence number: An integer (mod 65536)
incremented by 1 for every frame sent by the datalink layer.
This allows receiving entities using connectionless procedures
to use the automatic link quality measurement technique
described below.
Flags: The low-order bit is 1 if connectionless datalink is
preferred; 0 if connection-oriented is preferred. (Set by
system management based upon anticipated link quality.)
Other bits are reserved (sent 0).
Plaintext: An optional text message (length may be up to maximum
size remaining in datalink PDU).
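A minimal sketch of building and parsing an RRH packet from the fields
listed above follows. Several points are assumptions not settled by this
draft and are labeled as such in the code: the flags field is taken to be
one octet, all multi-octet fields are taken to be in network byte order,
and the checksum is taken to cover the entire RRH packet. The function
names are hypothetical.

```python
import socket
import struct

RSPF_VERSION = 1
RRH_TYPE = 3
# version, type, checksum, sender IP, last packet sent seq. #, flags.
# ASSUMPTION: flags is one octet; the draft text does not state its width.
_RRH = struct.Struct("!BBH4sHB")


def ip_checksum(data):
    """One's-complement sum of 16-bit words, as used for IP headers."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_rrh(router_ip, seq, connectionless, text=b""):
    flags = 1 if connectionless else 0  # low-order bit: connectionless preferred
    body = _RRH.pack(RSPF_VERSION, RRH_TYPE, 0,
                     socket.inet_aton(router_ip), seq % 65536, flags) + text
    # ASSUMPTION: checksum covers the whole packet, computed with the field zeroed.
    return body[:2] + struct.pack("!H", ip_checksum(body)) + body[4:]


def parse_rrh(packet):
    if len(packet) < _RRH.size:
        raise ValueError("RRH packet too short")
    version, ptype, _csum, ip, seq, flags = _RRH.unpack_from(packet, 0)
    if ptype != RRH_TYPE:
        raise ValueError("not an RRH packet")
    if ip_checksum(packet) != 0:  # a valid packet sums to zero
        raise ValueError("bad checksum")
    return {
        "version": version,
        "address": socket.inet_ntoa(ip),
        "seq": seq,
        "connectionless": bool(flags & 1),
        "text": packet[_RRH.size:],
    }
```

For example, parse_rrh(build_rrh("44.56.0.44", 12593, True, b"hello"))
returns the same address, sequence number, preference bit and plaintext
that were supplied.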
II.4. Automatic link quality measurement
A connectionless link or subnetwork may have very reliable, or very
sporadic, performance. Since there is no procedure for ensuring the
reliability of packets sent over a connectionless link, a high rate of
packet loss may occur without being detected by the routers. If this
loss is high enough, another route may become a better choice; a high
enough packet loss rate may be enough to mark a link as "down".
Every router shall maintain a count of packets sent over each link.
Every time an RRH message is sent, it includes the current value of this
counter (modulo 65536). Every router also maintains, in its adjacency
table, a count of packets received from this adjacency since the last
RRH message and the last received value of that counter.
Upon receipt of an RRH message, the recipient router compares the value
of the received packet counter with the last received value in the
adjacency table. The difference (taking into account wrap-around at the
modulus) is compared with the number of packets received since the last
RRH message. (This works even if an RRH message is lost.) This packet
loss ratio is then used as a link quality metric. (Timestamp is used
to compute packet arrival rate.)
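The comparison can be sketched as follows, assuming a 16-bit counter so
that the arithmetic is done modulo 65536. The function names are
hypothetical, and the draft leaves open exactly how the ratio is turned
into a cost adjustment, so only the ratio itself is computed here.

```python
SEQ_MODULUS = 65536  # assumes a 16-bit "last packet sent" counter


def packets_sent_between(prev_seq, new_seq):
    """Packets the neighbor sent between two RRH messages, allowing wrap-around."""
    return (new_seq - prev_seq) % SEQ_MODULUS


def loss_ratio(prev_seq, new_seq, packets_heard):
    """Fraction of the neighbor's packets that were not heard locally."""
    sent = packets_sent_between(prev_seq, new_seq)
    if sent == 0:
        return 0.0
    lost = max(sent - packets_heard, 0)  # clamp in case of counting skew
    return lost / sent
```

If the previous RRH carried 65530, the new one carries 4, and 8 packets
were heard in between, the neighbor sent 10 packets and the loss ratio
is 0.2. Because only counter values are compared, a lost RRH simply
lengthens the measurement interval.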
Connection-oriented data links presumably deliver 100% of attempted
packets. A high-quality connectionless link, such as Ethernet/LLC1, will
come close to this. However, single-frequency packet radio links are
prone to packet loss for several reasons, including hidden transmitters,
lack of collision detection, and RF interference. The packet loss ratio
is useful in setting link cost, and may also be used to determine
whether a link should use connectionless or connection-oriented
procedures.
If a router reports, in its link update packets, that a given link has a
cost of _n_, then its adjacencies (and only its adjacencies) may apply
the packet loss ratio to adjust the cost which they maintain in their
link state tables. These adjusted costs, rather than the received
costs, may then be propagated to other routers. Specific procedures
and parameters for this are for further study.
III. Acquisition of end node adjacencies
Three possible means of determining adjacencies to end nodes are the use
of connected-mode AX.25 links, the use of ARP, and the use of a
"wiretap" algorithm (see RFC981). Unless a connection mode Data Link
layer (with keepalive timers) is used, adjacent nodes may need to send
each other messages at regular intervals to ensure that the link is
still usable. A procedure is outlined below for routers and end nodes
to acquire knowledge of each other.
It is assumed that RSPF will not be activated in end nodes; this will
permit simpler versions of the IP software to be used. A node that has
RSPF support in its software but operates as an end node can also use
the router-router connection procedures and simply broadcast its
adjacency to the router in a one-entry bulletin with a horizon of one.
Such a node may also be simultaneously homed on two or more other
routers, unlike true end nodes whose traffic either bypasses RSPF (using
ARP) or arrives by way of its associated router.
If an end node knows the IP address of the router which will connect
it to the network at large, it establishes a connected-mode AX.25 (or
other data link layer) connection to the router; the presence of this
connection indicates that the node is reachable from that router,
which then adds it to its links table and subsequent bulletins. This
may, of course, require an ARP exchange in order to acquire the AX.25
(data link layer) address.
Alternatively, the end node can simply use ARP and use connectionless
link procedures. In this case the router should assume that the end node
is available until either a rather lengthy timer expires, or the router
is unable to make an ARP contact after the ARP timer expires. (A loss
of reachability should not be inferred from the ARP timeout.)
Routers should periodically broadcast their availability (suggested
interval: every 15 minutes) with an AX.25 UI frame sent to QST-0 (the
AX.25 broadcast address). A human-readable "unproto" message may go
here, allowing individual operators to recognize routers and connect
as appropriate. (No specific PDU coding is provided, as the end
nodes do not use RSPF.)
A router may also choose to use "Promiscuous ARP" to provide service to
an end node which is attempting to connect with an IP address reachable
by the router. In such a case the router should wait an extra interval
after receiving the ARP request because the desired destination may
actually be directly reachable; ARP procedures may need to be modified
to provide this.
Another potential approach is for routers to simply listen to AX.25
traffic on the link and determine who is adjacent to whom. This is the
gist of the "wiretap" algorithm in RFC981, which also finds non-adjacent
nodes by taking advantage of the source routing found in AX.25 frames.
Integration of wiretap into RSPF is for further study.
IV. Link state propagation
IV.1. Optional multicast/broadcast
Packet radio is inherently a broadcast medium. Packet radio networks,
however, may be viewed as a collection of individual links which happen
to use a broadcast physical medium. It is also possible to exploit the
broadcast nature of the medium. RSPF link state propagation procedures
allow but do not require such multicasting. If the link uses
connectionless procedures for user data packet exchange, then broadcast
procedures should be used for link state packet exchange. The converse
may not necessarily be true: The threshold of loss that militates
against connectionless transmission of user data may be more stringent
than that which calls for non-broadcast transmission of link state
packets. (Details for further study.)
IV.2. Routing update bulletins
Routing updates are passed along from router to router via routing
update bulletins. Bulletin propagation is designed to guarantee that
every node within a given "horizon" receives every routing update
message sent out by a given node.
Every router originates information about changes in its own adjacencies,
as well as periodic retransmissions of its entire list of adjacencies.
These messages are then propagated among other routers. The router whose
adjacency information is being broadcast is called the _reporting
router_; this may be some hops away from the routers which forward it to
their own adjacencies. Each reporting router's adjacency updates
contain a sequence number (and in some cases, a subsequence number).
These sequence numbers are generated by the reporting router and
propagated; they are not changed when a bulletin is relayed. They
are incremented by 1 (modulo 65536) every time a new one is generated.
Bulletins may also carry incremental change information to previous
bulletins. These carry the same sequence number as the full bulletin to
which they are reporting incremental changes; each such bulletin has a
subsequence number. The subsequence number is zero for a complete
routing update bulletin.
Every bulletin also has a horizon value, set by the reporting router,
associated with each of its constituent links. (A given reporting
router may have more than one constituent link, if it is a multi-port
router.) Every time a bulletin is propagated, each horizon value is
decremented by 1. When it hits zero, the bulletin is not propagated
further. Note that for horizon purposes, a bulletin's individual
constituent links may have separate horizons; when a link's horizon hits
zero, other links' adjacency information from the same reporting router
may continue to be propagated.
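The per-link horizon rule might be sketched as below. This is one
reading of the text, stated as an assumption: the relaying router
decrements the horizon it received, and a link whose horizon reaches
zero is dropped from the relayed copy while the remaining links carry
on. The dictionary layout and function name are hypothetical.

```python
def links_to_relay(links):
    """Return copies of the links that may still be relayed, horizons decremented."""
    relayed = []
    for link in links:
        remaining = link["horizon"] - 1
        if remaining > 0:  # a link whose horizon hits zero goes no further
            relayed.append({**link, "horizon": remaining})
    return relayed


received = [
    {"name": "backbone", "horizon": 5},
    {"name": "local", "horizon": 1},
]
outgoing = links_to_relay(received)
```

Here the backbone link is relayed with a horizon of 4, and the local
link, received with a horizon of 1, is not relayed at all.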
Every router maintains within memory a routers table containing one
tuple for every other router on the network. This tuple contains the
following elements:
IP Address
Last received bulletin sequence number
Last received bulletin subsequence number
Timestamp for when the data was received.
This table is used to coordinate the receipt and transmission of
bulletins, using either broadcast or non-broadcast procedures.
IV.3. Flooding without congestion collapse
A bulletin from reporting router X (stating adjacencies seen by X) is
sent, initially by X, to every adjacent router A, which passes it along
to all of its own adjacencies B. The routing update bulletin (which is
loosely based on the Internet EGP (Exterior Gateway Protocol)) may
contain one or more routers' adjacency lists, to be forwarded to
adjacent routers. This "flooding" is controlled such that no reporting
router's adjacency update is sent more than once between the same pair
of routers. (A bulletin packet may actually concatenate multiple
reporting routers' adjacency information; each is numbered separately,
even if transmitted within the same packet. This is done to reduce
the overhead of short transmitted packets.)
After router A sends its bulletin to B, the recipient router B then
looks at the sequence number of each reporting router X's adjacency
message and the sequence number of the last received adjacency message
that originated from X. If the just-received bulletin is newer (higher
number), then it sends a positive acknowledgement to A, and joins in the
flooding to its own adjacencies. The horizon is, of course,
decremented.
If it has the same number, B checks the horizon left in the received
bulletin against the horizon left when it previously received the bulletin.
In the event that the latest bulletin received had a shorter (lower
number) horizon left than the earlier one, it simply acknowledges the
(redundant) message. But if the reporting router X's horizon left is
greater for the new copy of the bulletin, router B propagates it as if
it were a new bulletin. This way, if the bulletin happened to first
arrive via a roundabout path, it won't accidentally fail to reach the
intended end of its range.
If any router B receives from router A a bulletin (from any reporting
router X) that contains a lower sequence number than one that had
previously arrived at B, then it can be presumed to be "obsolete"
information. B replies by sending a bulletin to A with the latest link
state information for that node X. As a corollary, a router may poll
for information about a given reporting router by sending a routing
update bulletin for that reporting router with a sequence number that is
lower than the latest one it actually has received. Said bulletin may
contain zero links' information. (Note that since the sequence
number is modular, a value of 0 is not correct for this function as 0
is higher than 65535; the "poll" number should be only slightly lower.)
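The three cases above (newer, same, older) can be summarized in a short
sketch. It is illustrative only. The sequence comparison uses fixed
wrap-around windows with example thresholds of 100 and 65400, which are
assumptions for this sketch, and the per-reporter record of "horizon
left" is an assumed extension of the routers table. Subsequence numbers
are ignored for brevity, and all names are hypothetical.

```python
from dataclasses import dataclass


def seq_newer(a, b, low=100, high=65400):
    """True if sequence number a is newer than b, allowing for wrap-around."""
    if a < low and b > high:
        return True   # a has wrapped past zero, b has not
    if b < low and a > high:
        return False  # b has wrapped past zero, a has not
    return a > b


@dataclass
class ReporterState:
    seq: int      # last bulletin sequence number seen from this reporter
    horizon: int  # horizon left when that bulletin arrived (assumed field)


def on_bulletin(table, reporter, seq, horizon):
    """Decide what router B does with a bulletin relayed by router A."""
    state = table.get(reporter)
    if state is None or seq_newer(seq, state.seq):
        table[reporter] = ReporterState(seq, horizon)
        return "ack+flood"
    if seq == state.seq:
        if horizon > state.horizon:  # arrived earlier by a roundabout path
            state.horizon = horizon
            return "ack+flood"
        return "ack"                 # redundant copy
    return "reply-with-latest"       # obsolete information, or a poll
```

A poll is then simply a bulletin whose sequence number is slightly older
than the sender's latest, which draws the "reply-with-latest" response.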
IV.4. Non-broadcast bulletin propagation
A router may acquire routing information from adjacent routers via
point-to-point procedures which treat every adjacency as a separate
logical data link. (Such a procedure also works, of course, over
point-to-point links such as wirelines.) This tends to have the highest
reliability, since connection-oriented data link control procedures can
be used to ensure the accuracy and completeness of the data passed over
the link. It has the disadvantage of requiring separate transmission of
the same data to different adjacent nodes on the same channel.
IV.5. Broadcast bulletin propagation
When a router is adjacent to several other routers via the same
broadcast (e.g., packet radio) channel, retransmission of routing
bulletins to each adjacency makes additional use of bandwidth, which may
be a scarce resource. A broadcast procedure may be used to pass the
bulletin along that link. Note that broadcast propagation of bulletins
may be combined with non-broadcast propagation, on a link by link basis.
Although packet radio is a broadcast medium, it is not a perfect one:
The reliability of multicast packets is not as high as for wireline LANs.
This leads to the need for a unique broadcast "flooding" technique.
A routing update bulletin is broadcast to the Layer 2 multicast AX.25
address "QSTRTR-0" using the IP multicast address (in AMPRnet,
44.255.255.255). Individual recipient routers may or may not receive
the entire bulletin, since there is no acknowledgement possible.
In a non-broadcast message sent via a connection-oriented datalink, the
entire routing update bulletin can be assumed to have been received
intact. Thus, if a given reporting router has originated a new complete
list of its adjacencies (new sequence number, subsequence number equals
0), then any adjacency not included is presumed to have ceased to exist.
Such a presumption is not always possible when a bulletin was received
via unacknowledged broadcast, as the message might have been received in
part. This should not make the partially received bulletin unusable.
A bulletin is transmitted in one or more packets, each of which
constitutes a numbered fragment of the whole bulletin. Within the
transmitted bulletin, each individual reporting router's node-header
contains the number of links being reported on, and each link-header
contains the number of adjacencies being reported on. Since each IP
packet that makes up a bulletin contains a fragment number, it is also
possible to determine whether or not a fragment was lost.
In connection-oriented non-broadcast procedures, a routing update
bulletin (not a partial one with a subsequence number >0) provides a
complete list of adjacencies known to the sending node, except where the
horizon is exceeded. Absence of a previously-known adjacency then
implies that the adjacency has been lost. However, that assumption does
not apply to fragmented bulletins received in part, which can occur with
broadcast procedures: If a fragment was lost before reaching the end of
a given reporting router's portion of the bulletin, then the absence of
a previously-known adjacency in the received bulletin does not mean that
the adjacency was lost.
RSPF procedures dictate that routing update bulletins with a subsequence
number of zero are presumed to be complete lists of adjacencies from
their originators; higher subsequence numbers represent incremental
changes only. In the broadcast procedures, a routing update bulletin
with a subsequence number of zero, if received only in part, is
treated as an incremental change bulletin.
Thus, the recipient compares the sequence number with the previously
received sequence number from that reporting router. If it is higher
than the previously received one, then its adjacencies are used. If
it was received in full, and had a subsequence number of 0, then its
previous adjacencies are replaced. If it was not received in full, or
if its subsequence number was not zero, then its previous adjacencies
are left intact unless explicitly changed by the received bulletin.
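The replace-or-merge decision can be sketched as follows, under the
assumption that a reporting router's adjacencies are held as a mapping
from address to cost. The function names are hypothetical, and the
sketch does not model how a Bad News entry removes an adjacency, since
that encoding is not covered at this point in the document.

```python
def bulletin_complete(fragments_seen, fragment_total):
    """True if every fragment number 1..fragment_total was received."""
    return set(fragments_seen) == set(range(1, fragment_total + 1))


def apply_adjacencies(previous, reported, subsequence, received_in_full):
    """Return the new adjacency mapping for one reporting router."""
    if received_in_full and subsequence == 0:
        # Complete list: anything not mentioned is presumed gone.
        return dict(reported)
    # Partial or incremental: keep old entries unless explicitly changed.
    merged = dict(previous)
    merged.update(reported)
    return merged


previous = {"44.56.0.7": 4, "44.56.0.9": 6}
reported = {"44.56.0.7": 5}
full = apply_adjacencies(previous, reported, 0, bulletin_complete([1, 2], 2))
partial = apply_adjacencies(previous, reported, 0, bulletin_complete([1], 2))
```

In this example the fully received bulletin leaves only 44.56.0.7,
while the partially received one keeps 44.56.0.9 and merely updates the
cost of 44.56.0.7.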
If a bulletin is received in full, then the routers table is updated
with the latest sequence and/or subsequence number and timestamp.
If it is received in part, then these entries are not updated. After
the bulletin has been completely transmitted, a recipient node which has
received a partial update from a reporting node may poll for that node's
latest information, by using the (now known to be obsolete) sequence
number for that router in its router table in a bulletin, with zero links
for that reporting router. (This is the same polling procedure used for
non-broadcast links.)
Note that if a fragment is lost, a reporting router whose node-header is
in the lost fragment will of course remain unchanged in the recipient's
database. Furthermore, any data received in subsequent fragments,
prior to a node-header, will also be meaningless. The last adjacency
of the last link in a reporting router's bulletin will have the Last
flag set to 1, signaling that a node header follows the address.
IV.6. Routing update bulletin packets
A routing update bulletin packet (Table IV.1) may contain several
different reporting routers' updated link state information,
concatenated into one message, with a different sequence number for each
source (reporting router). One of the sources, of course, may be the
local router. Each router's link state information is further broken
down by link, which allows its backbone routing information to be
propagated separately from its local end node adjacency information.
Incremental changes (good news and bad news)
Bulletins that only report changes in state come in two flavors. Good
News bulletins inform other routers that an adjacency has been noted.
Bad News bulletins inform them that an adjacency has been lost. If an
end node establishes a connection with a router whose node group default
addresses (based on the significant bit count) already include that end
station, however, no bulletin need be sent out, as packets to that end
station will already be routed correctly. Theoretically, a router could
send out a new full routing update bulletin every time it gained or lost
an adjacency. However, the use of shorter Good News and Bad News
packets, which do not contain a full routing update, allows the network
to remain relatively current with less transmitted traffic.
Good news and bad news packets are propagated like other packets,
but are not originated by the same rules. While full routing bulletins
are originated based upon a timer, good news packets are transmitted
immediately. This enables new links to be useful quickly. Bad news,
however, should not travel as fast: A node should cache any bad
news message for a time (initial recommendation for this timer: 60
seconds) while waiting for the link to come back up. This helps keep
the network stable; if the node receives a packet destined for the
lost destination, it may send an ICMP "host unreachable" message
to the originator of the packet, unless it can reroute the packet
itself (as may happen with the loss of a backbone link where others
exist).
Because good news and bad news messages represent changes to the last
full link state bulletin propagated, but do not purport to completely
represent a node's link states, they carry bulletin subsequence
numbers. These use the last bulletin sequence number originated by the
reporting router, but the sub-sequence value increments from 1. (A full
link state packet has a sub-sequence value of 0, and resets the
subsequence counter.)
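The origination rules for incremental changes might look like the
following sketch. It assumes that a link which comes back up before the
hold timer expires cancels the cached bad news without any bulletin
being sent, which is an interpretation of the text rather than a stated
rule. The class and method names are hypothetical, and overflow of the
subsequence field is not handled.

```python
SEQ_MODULUS = 65536


class NewsOriginator:
    def __init__(self, hold_seconds=60):
        self.hold_seconds = hold_seconds  # bad news hold timer
        self.seq = 0
        self.subseq = 0
        self.pending_bad = {}  # address -> time the adjacency was lost

    def full_bulletin(self):
        """A full bulletin takes a new sequence number and resets the subsequence."""
        self.seq = (self.seq + 1) % SEQ_MODULUS
        self.subseq = 0
        return (self.seq, self.subseq)

    def _next_change(self, kind, addr):
        self.subseq += 1  # incremental changes count up from 1
        return (kind, addr, self.seq, self.subseq)

    def link_up(self, addr):
        """Good news goes out at once, unless it cancels held bad news."""
        if self.pending_bad.pop(addr, None) is not None:
            return None
        return self._next_change("good", addr)

    def link_down(self, addr, now):
        """Bad news is cached, not sent."""
        self.pending_bad.setdefault(addr, now)

    def release_bad_news(self, now):
        """Emit bad news for adjacencies still down after the hold time."""
        due = [a for a, t in self.pending_bad.items()
               if now - t >= self.hold_seconds]
        for a in due:
            del self.pending_bad[a]
        return [self._next_change("bad", a) for a in due]
```

A caller would invoke release_bad_news() periodically; a link that
flaps within the hold time therefore generates no traffic at all.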
Routes to nearby destinations
Sometimes more than one router will serve the same area (determined by
the significant bits' matching), and they will need to know which one has
the better path to a given station. These adjacency messages may only
require a short horizon, as will Good News and Bad News packets which
refer to end nodes going on and off the air. Higher horizons are
needed for backbone routers.
Table IV.1. Full routing update (link state packet) bulletin:
                                 1               2
|0              |8              |6              |4              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ----
| RSPF Version #|     Type      |  fragment #   | fragment total| packet
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -hdr
|           Checksum            |  sync octet   | # nodes below |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ----
|               Reporting node #1 full IP Address               | node
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -hdr
|  Node 1 bulletin sequence #   | sub-sequence #|    # links    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ----
| horizon left  |  ERP factor   |   link cost   | #adjacencies  | link
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -hdr
|significant bits|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Adjacent node(s) 1,1,1 IP address | adj.
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|significant bits|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Adjacent node(s) 1,1,2 IP address | adj.
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|significant bits|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Adjacent node(s) 1,1,n IP address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| horizon left | ERP factor | link cost | #adjacencies | link
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|significant bits|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Adjacent node(s) 1,2,1 IP address | adj.
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|significant bits|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Adjacent node(s) 1,2,n IP address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Reporting node #2 full IP Address | node
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Node 2 bulletin sequence # | sub-sequence #| # links |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| horizon left | ERP factor | link cost | #adjacencies |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|significant bits|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Adjacent node(s) 2,1,1 IP address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|significant bits|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Adjacent node(s) 2,1,2 IP address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| horizon left | ERP factor | link cost | #adjacencies |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|significant bits|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Adjacent node(s) 2,2,1 IP address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|significant bits|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Adjacent node(s) 2,2,n IP address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Reporting node #n full IP address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Node n bulletin sequence # | sub-sequence #| # links |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
etc.
Parameters--
An RSPF bulletin packet is sent within IP with a type of <tbd>.
Each routing update packet contains a packet header that contains:
RSPF Version Number: Version number of the protocol (initially 1).
Type: (Value 1 for Full Routing Update, 2 for Partial Routing
Update.)
Fragment Number: States which fragment, in a segmented message,
this is, beginning at 1. Non-fragmented messages use 1.
Fragment total: Total number of fragments in message; 1 if not
fragmented.
Checksum: IP-style checksum.
Sync octet: Which octet in this packet (counting from this
byte as byte 0) is the beginning of a node-header. If 0,
this fragment has no node-header. Non-fragmented messages
use a value of 2 (because one byte follows in packet header).
Number of nodes reporting: The number of reporting routers in the
message that follows (this packet or a sequence of packets).
The node-header (for each reporting router) contains 8 octets:
Reporting router #n full IP address: The IP address of the router
whose adjacencies are being reported below.
Bulletin sequence number: When a bulletin is passed along, this
number is NOT changed; every new bulletin from a given Reporting
router has a value 1 higher than the previous bulletin from that
reporting router. (Note: This is modulo 65536, so implementations
must cope with the "wrap around" and consider values below, say,
100, to be "higher" than values above, say, 65400. Between 100
and 65400, modular arithmetic is NOT used.)
Sub-sequence number: Good news and bad news packets representing
incremental changes from the last full report increment this value
by 1; it is 0 for full bulletins.
# links: The number of different cost-horizon values (typically,
but not necessarily, representing different types of link in a
multiport gateway) being reported below; the following four octets
are the header for each link.
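The node-header fields and the sequence number wrap-around rule above can be sketched as follows. This is only an illustration: network byte order is an assumption (the text does not state one), the function names are hypothetical, and the thresholds 100 and 65400 are the "say" values from the text, kept as tunable constants.

```python
import struct
from ipaddress import IPv4Address

# Assumed layout of the 8-octet node-header, network byte order:
# 4-octet IP address, 2-octet bulletin sequence #, 1-octet sub-sequence #,
# 1-octet # links. The byte order is an assumption of this sketch.
NODE_HEADER = struct.Struct("!4sHBB")

# The text only says "say, 100" and "say, 65400"; both are tunable here.
WRAP_LOW = 100
WRAP_HIGH = 65400


def parse_node_header(data: bytes) -> dict:
    """Unpack the first 8 octets of data as a node-header."""
    addr, seq, subseq, nlinks = NODE_HEADER.unpack(data[:8])
    return {"router": str(IPv4Address(addr)), "seq": seq,
            "subseq": subseq, "links": nlinks}


def is_newer(new_seq: int, old_seq: int) -> bool:
    """True if new_seq should be treated as later than old_seq."""
    if new_seq < WRAP_LOW and old_seq > WRAP_HIGH:
        return True    # wrapped around: a small value follows a large one
    if old_seq < WRAP_LOW and new_seq > WRAP_HIGH:
        return False   # stored value has already wrapped; new one is stale
    return new_seq > old_seq   # plain comparison everywhere else
```

A receiver would call is_newer() with the sequence number of an arriving bulletin and the last one stored for the same reporting router, and discard the bulletin if the result is False.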
[For each reporting router, adjacencies are reported separately by
link cost. This is the received cost value for that data link, after
any adjustment that may be based upon packet loss ratio. Adjacencies
are also reported separately by horizon, even if the cost is the same.
It does not matter at this point if there are multiple RF or other
links if their cost and horizon are the same. Likewise, separate
received costs or horizons on one link will be treated separately
and such adjacencies will be grouped under separate link headers:]
Horizon left: This number is decremented every time a routing update
bulletin is passed along; when it reaches 0, it is not passed further.
Link cost: A "figure of merit" for each link, rising with slower/poorer
links. This is the number whose total is minimized by SPF. The range
is 1-127. As a starting point, a 56000 bps fdx backbone link might have
a value of 1, a 4800bps hdx link a value of 4, a 1200bps hdx link a
value of 8 and a 300 bps hf "wormhole" a value of 16. A value of 255
denotes the loss of a link; this is found in Bad News packets. These
values should be coordinated network-wide; adjusting them will change
the way packets are routed by RSPF.
Number of adjacencies: The number of adjacencies to be listed for that
reporting node.
ERP Factor: Used for "approximating" a route; contains the number of
significant bits for which a given node can be presumed to be a valid
router, even if it doesn't report in detail. A low factor = wider
coverage area; thus ERP of 16 means that if NO other match is found for
a given address, this router will try to handle it if it matches 16
bits. Basically a handle for future enhancements; its use will not
be required in the initial release of RSPF.
For each adjacency of the given link cost, the following is provided:
Significant bits: The number of bits used for node group routing
purposes. Usually 32 but may be lower if a given link purports to
serve all end nodes in an area defined using the most-matched-bits
node group convention. Higher numbers of bits matched take a higher
priority than least cost. This uses the low-order 5 bits of the octet.
Last-flag: If this is the last adjacency in the list for this
reporting router, this value is 1; otherwise it is 0. (This
occupies the high-order bit of the significant bits octet.)
Full IP address: The full IP address for this node.
Note that the n,n,n vector within the bulletin has three fields in the
above representation: Reporting router within bulletin packet, link
cost/horizon within reporting router, and reporting adjacency with that
link cost/horizon.
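As an illustration of the link header and adjacency fields described above, the following sketch decodes one link and its adjacencies. The field order follows the packet diagram; the function name, the dictionary keys and the use of one octet per header field are assumptions of the sketch. The text does not say how a value of 32 fits into the low-order 5 bits, so the raw field is returned without interpretation.

```python
import struct
from ipaddress import IPv4Address

# horizon left, ERP factor, link cost, # adjacencies (one octet each)
LINK_HEADER = struct.Struct("!BBBB")
# significant-bits octet followed by the full IP address
ADJACENCY = struct.Struct("!B4s")

LINK_LOST = 255   # cost value that denotes the loss of a link (Bad News)


def parse_link(data: bytes, offset: int = 0):
    """Parse one link header and its adjacencies; return (link, next_offset)."""
    horizon, erp, cost, count = LINK_HEADER.unpack_from(data, offset)
    offset += LINK_HEADER.size
    adjacencies = []
    for _ in range(count):
        flags, addr = ADJACENCY.unpack_from(data, offset)
        offset += ADJACENCY.size
        adjacencies.append({
            "address": str(IPv4Address(addr)),
            "sig_bits_raw": flags & 0x1F,   # low-order 5 bits, as in the text
            "last": bool(flags & 0x80),     # high-order bit: Last-flag
        })
    link = {"horizon": horizon, "erp": erp, "cost": cost,
            "lost": cost == LINK_LOST, "adjacencies": adjacencies}
    return link, offset
```

The returned offset points at the next link header (or the next node-header), so a caller can walk a whole bulletin by calling parse_link() once per link counted in the node-header.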
IV.7. Fragmentation
In a moderate to large network, a full routing update can easily exceed
the maximum size of an AX.25 frame or IP packet. The RSPF Routing
Update message, however, may be sent in fragments. No specific protocol
is required for this; bulletins are not assumed to be terminated by a
packet boundary. Each fragment is, however, numbered in the packet
header, along with an indication of the number of fragments to be
sent.
In order to permit parsing of the incoming fragments by routers who
are using unacknowledged broadcast information (with the high
likelihood of lost fragments), every bulletin's packet header contains a
sync octet indicator. This indicates which byte, where the next byte in
the header is byte 1, is the beginning of a node header. If a previous
fragment was lost, the receiver should ignore the number of bytes
indicated in the sync octet before resuming parsing of the packet. (If
the fragment does not exceed 255 bytes, this will always be sufficient.
It is recognized that long packets may not be able to use this mechanism
reliably, and a value of "0" should be used to indicate that no sync is
possible within this fragment.)
Each routing update bulletin takes the form of a three-dimensional
list, with the dimensions being reporting router, link and adjacency. A
given bulletin may report on link state from one or more remote nodes,
as well as the sending node. Each node may have one or more links; each
link may have one or more adjacencies.
Packets may not be fragmented within adjacencies, but may be
fragmented after an adjacency's address and before the next adjacency's
significant bits field. The next fragment, in a new packet, simply
begins where the last one left off; the receiver knows how many more to
expect based upon the node and link count in the respective node-header
and link-header.
A router may not start sending a new Routing Update message of any kind
to any given IP address until all fragments of a previous message have
been transmitted.
IV.8. Bulletin Timers
The timers used for bulletin updates must be a compromise between
maintaining the network's current state and the traffic required to do
so. An initial suggestion: Good news messages should be initiated
within a few seconds and bad news messages should be initiated within
one minute, with relatively short horizons on the bulletins (i.e., the
routers within the region). Full routing updates with normal horizons
should be sent out every 30 minutes. If the network is small, more
frequent updates may be possible; too frequent updates risk choking
the network with update traffic.
V. The Shortest Path First spanning tree algorithm
As a routing node comes onto the network, it acquires over time a
complete list of adjacencies between other nodes on the network except
as limited by the "horizon". Each adjacency (point to point link) has a
"cost" associated with it. It should be noted that adjacencies, for the
purposes of this algorithm, are "logical" and not necessarily physical;
if it occurs below IP (as in AX.25 digipeating or NET/ROM), the two
ends of the link are still adjacent. (Adjacency at the IP internet
layer is based upon subnetworks, which may contain their own links.)
Cost is set for the transmit side of each link; if the receiver and
transmitter do not agree on cost, the network may apply different routes
for packets going in opposite directions between the same two end nodes.
(This is not a problem. In a radio environment, one should NOT assume
reciprocity across a link.)
Each router builds a _link state table_ that lists, for every known
link (from every reporting router), the two ends and the cost of that
end of the link. The ends are listed by an IP address and, for the
destination IP address, a number of significant bits. This is what's
updated by the bulletins and by good news/bad news messages.
Source IP address    Dest. IP addr/bits    Cost
44.56.0.44           44.56.0.128/32        5
44.56.0.44           44.56.0.12/25         10
The goal of the algorithm is to build a _paths table_ which lists, for
every reachable node (or node group approximation of fewer than 32
bits) on the network, that node address (or node group address and
number of significant bits), the adjacent node used to get there (i.e.,
where you blast the packets next), and the total cost of the path.
(This feeds the Route table in the IP Route module in NET.)
Every router contains, for the purposes of building the tree, a
_trial table_ that has three fields: Destination address/bits,
adjacent node, and cost of this path. The paths table additionally
has one more field, the _parent_ node, which is the last hop
before the destination. Thus in a path A -> B -> C -> D -> E, home
router A views E as the destination, D as the parent, and B as the
adjacency. Note that in the path from A to B, A is the parent node.
Begin building the paths table by using the home router as the first
node on the paths table. The cost to reach yourself is always 0, so
make the first entry on the paths table as follows: Source=self,
destination=self, parent=self, and cost=0. From here on in, parent
is always (by definition of the SPF spanning tree) the node most
recently added to the paths table.
Destination    Adjacent       Parent         Cost
44.56.0.128    44.56.0.128    44.56.0.44     5
44.56.0.131    44.56.0.128    44.56.0.128    10
44.56.0.200    44.56.0.128    44.56.0.131    15
Paths Table showing relationship between
destination, parent and adjacent nodes, where the home
node is 44.56.0.44 and 44.56.0.200 is three hops away;
all hops here have a cost of 5.
SCAN_THE_LINKS:
The home router now scans its links table looking for all nodes (routers
and end nodes) that are adjacent to the parent node, except for links to
nodes which are already on the paths table. (It is generally fastest to
build the paths table by scanning the links table in order of increasing
link cost.) Each of these new nodes is added to the trial table, except
for the parent node (which is already on the paths table, so it can't
possibly have a shorter path). The value of cost used on the trial table
is the cost of the link from the parent to the destination plus the cost
to the parent node (taken from the paths table).
A node may only appear once in the trial table at any given time. If
an adjacency is found to a node that is already on the trial table, a
new entry replaces the existing entry if and only if the new total
cost is lower. If the cost is higher, it is ignored. (If the costs
are equal, then a "tiebreaker" is determined by treating the
lower-numbered IP address as the lower cost. This will simply make
the spanning tree more deterministic in case of tie.) Thus the trial
table always contains the lowest cost path to each destination found so
far.
Once all of the links from the parent node have had their chance to go
onto the trial table, scan the trial table for the _one_ entry with the
lowest total cost. In case of tie, pick the lower IP address (again
just to be deterministic). Move this node to the paths table;
guaranteed, there'll be no cheaper path to that node! The adjacency
used for that node in the paths table is the adjacency to its parent
node. Note that the parent _must_ already be on the paths table, since
trial entries are only created from nodes already on it; you're
working your way outward.
This new addition to the paths table becomes the new parent node.
Repeat the procedure from SCAN_THE_LINKS above until there are no nodes
left on the trial table. This means you've completed the spanning tree
and have a least-cost path to every other node.
One of the router parameters is maximum_cost. If the cost to a given
parent node exceeds this value, the procedure stops, since all
subsequent entries in the route table will have a higher cost. This
value relates to the time-to-live parameter of the IP packet: It makes
little sense to compute a routing table to nodes that cannot be reached
within time-to-live. (Ideally, ttl will be implemented using a timer
rather than a node counter, but this is difficult to do with some of the
packet radio data link controller implementations; viz., TNCs.)
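The procedure above, including the maximum_cost cutoff, can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: the function name and tuple layouts are hypothetical, the equal-cost tiebreaker on the trial table is taken to compare parent addresses (the text does not say which address is meant), a node group (fewer than 32 bits) is treated as a leaf, and the procedure stops before adding a node whose cost exceeds maximum_cost.

```python
from ipaddress import IPv4Address


def build_paths(home, links, max_cost=None):
    """home: IP string; links: iterable of (source, dest, bits, cost).
    Returns the paths table as {(dest, bits): (adjacent, parent, cost)}."""
    def addr_key(ip):
        return int(IPv4Address(ip))

    paths = {(home, 32): (home, home, 0)}   # cost to reach yourself is 0
    trial = {}
    parent, parent_cost = home, 0

    while True:
        # SCAN_THE_LINKS: every link leaving the current parent node
        for source, dest, bits, cost in links:
            if source != parent or (dest, bits) in paths:
                continue
            total = parent_cost + cost
            # First hop: the node is its own adjacency; otherwise inherit
            # the adjacency already chosen for the parent.
            adjacent = dest if parent == home else paths[(parent, 32)][0]
            candidate = (total, addr_key(parent), adjacent, parent)
            old = trial.get((dest, bits))
            # Replace only if cheaper; on equal cost the lower parent
            # address wins (assumed reading of the tiebreaker).
            if old is None or candidate[:2] < old[:2]:
                trial[(dest, bits)] = candidate
        if not trial:
            break   # spanning tree complete
        # Move the single cheapest trial entry to the paths table;
        # ties go to the lower destination address.
        best = min(trial, key=lambda k: (trial[k][0], addr_key(k[0]), k[1]))
        total, _, adjacent, via = trial.pop(best)
        if max_cost is not None and total > max_cost:
            break   # everything still on the trial table costs at least this
        paths[best] = (adjacent, via, total)
        # Node groups are leaves here: nothing is scanned from them.
        parent = best[0] if best[1] == 32 else None
        parent_cost = total
    return paths
```

Run against the three-hop example above (home node 44.56.0.44, each hop costing 5), the sketch yields the same Destination, Adjacent, Parent and Cost values as the sample paths table.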
A router should recalculate its routes periodically based upon the
current links table. How frequently depends upon the CPU load involved.
Initial estimates are that it should be recalculated after receipt of
every routing update bulletin: Partial calculations do not save
enough CPU time to make them worthwhile.
V.2. Filling in the NET routing table
The route table in NET (the KA9Q et al implementation of IP) contains,
for each entry, the destination address, number of bits, interface,
gateway and metric. This is essentially left intact, except that metric
is filled in by cost and the routing decision looks for the least cost
of all matches. The route table is filled in from the paths table.
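A minimal sketch of such a routing decision is shown below. It assumes that the rule stated under Significant bits (more bits matched takes priority over least cost) is applied first, and that metric only decides among entries with the same number of bits. The function names, the dictionary keys and the interface names in any sample data are hypothetical and are not taken from NET.

```python
from ipaddress import IPv4Address


def matches(addr: str, dest: str, bits: int) -> bool:
    """True if the top `bits` bits of addr equal those of dest."""
    mask = (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF
    return (int(IPv4Address(addr)) & mask) == (int(IPv4Address(dest)) & mask)


def lookup(route_table, addr):
    """route_table: list of dicts with dest, bits, interface, gateway, metric.
    Prefers the most bits matched, then the lowest metric; None if no match."""
    candidates = [r for r in route_table
                  if matches(addr, r["dest"], r["bits"])]
    if not candidates:
        return None
    return min(candidates, key=lambda r: (-r["bits"], r["metric"]))
```

With this ordering a 32-bit entry for a portable end node overrides a cheaper node group entry that also matches its address.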
Since the routing table will be periodically recalculated from
scratch, implementation may require two route tables, one "in
progress" and one "in service". When the route calculation is
complete, the "in progress" table becomes "in service" and the old one
is cleared for reuse. This allows packet forwarding to continue while
the completed paths table is being converted into the route table.
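The two-table arrangement can be sketched as below. The class and attribute names are hypothetical; the paths table is assumed to be a mapping from (destination, bits) to (adjacent, parent, cost), and the interface field is omitted because it cannot be derived from the paths table alone.

```python
class RouteTables:
    """Two route tables: one in service, one being rebuilt."""

    def __init__(self):
        self.in_service = []    # consulted by packet forwarding
        self.in_progress = []   # filled during recalculation

    def rebuild(self, paths):
        """paths: {(dest, bits): (adjacent, parent, cost)}"""
        self.in_progress.clear()
        for (dest, bits), (adjacent, _parent, cost) in paths.items():
            self.in_progress.append(
                {"dest": dest, "bits": bits,
                 "gateway": adjacent, "metric": cost})
        # Swap roles, then clear the old in-service table for reuse.
        self.in_service, self.in_progress = self.in_progress, self.in_service
        self.in_progress.clear()
```

Forwarding code reads only in_service, so it never sees a partially filled table; the swap itself is a single assignment.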
Appendix I. Router parameters
Every router must set a number of parameters in order to properly
operate. While RSPF builds its routing matrix automatically, overall
network efficiency and stability may require some fine-tuning based
upon experience. These include parameters set for each data link
layer entity (i.e., each radio channel) and for the router in general.
Link settings:
Set Link cost: This is the cost parameter based upon the link speed
and type. Since the overall cost of the end-to-end path is
minimized by the RSPF spanning tree, link cost should be set to
arrive at the best overall network performance. The legal range
is 1-127. This is sent in routing update bulletins.
Node settings:
Add/Delete Node group: [IPaddr]/bits {cost}. This allows a node to
announce its ability to serve a group of nodes, identified using
the standard NET convention of address/significant bits. Thus a
node group setting of [44.56.0.1]/25 will match all nodes from
[44.56.0.1] to [44.56.0.127]. Cost is optional; if set, this
cost to reach such nodes will be propagated; otherwise, the
link's default is used.
Set horizon link: This sets the horizon value for the node's
routing bulletins apropos 32-bit addresses of other connected
routers. This should be high enough to propagate across the
backbone.
Set horizon group: This sets the horizon value for the node's
routing bulletins apropos node group addresses (fewer than 32
bits matched). Usually matches the horizon link value.
Set horizon local: This sets the horizon value for the node's
routing bulletins apropos full link addresses (32 bits) within
the node group area. This is set to a low value so that only
other routers serving the same or overlapping node group(s)
will receive these messages.
Set horizon portable: This sets the horizon value for the
node's routing bulletins apropos full link addresses (32 bits)
not within a node group. This allows portable end nodes to
have their location in the network propagated farther than
the local horizon; this is usually set the same as horizon group.
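How the four horizon settings might be chosen for a single advertised adjacency is sketched below. The function name, its arguments and the order in which the cases are tested are assumptions of this sketch; in particular, whether an adjacency is another router is assumed to be known to the caller.

```python
from ipaddress import IPv4Address, IPv4Network


def pick_horizon(addr, bits, is_router, node_groups, horizons):
    """Choose which horizon setting applies to one advertised adjacency.

    addr, bits  : the adjacency as advertised
    is_router   : True if the adjacency is another connected router
    node_groups : list of "address/bits" strings this router serves
    horizons    : dict with keys "link", "group", "local", "portable"
    """
    if bits < 32:
        return horizons["group"]      # node group address
    if is_router:
        return horizons["link"]       # 32-bit address of a connected router
    ip = IPv4Address(addr)
    for group in node_groups:
        # strict=False accepts host bits in the group, e.g. 44.56.0.1/25
        if ip in IPv4Network(group, strict=False):
            return horizons["local"]  # end node inside a served node group
    return horizons["portable"]       # end node outside every node group
```

For a router serving node group [44.56.0.1]/25, an end node at 44.56.0.127 would get the local horizon, while one at 44.56.0.128 would get the portable horizon.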